Phone dependent modeling of hyperarticulated effects
Abstract
In spoken dialogue systems, hyperarticulation occurs as an attempt to recover from previous recognition errors. It is commonly observed that real users in particular apply recovery strategies similar to those used in human-human interaction. Previous studies have shown that current speech recognizers cannot handle hyperarticulated speech. Because word error rates are higher on hyperarticulated speech, users reinforce this speaking style, which results in even more recognition errors. In this paper, we present approaches to building robust acoustic models for hyperarticulated speech. The key point is that the change in acoustic features under hyperarticulation is a phone-dependent effect. The idea is to use the likelihood criterion to decide which phones should be treated separately. This can be done by incorporating dynamic questions about hyperarticulation into the clustering stage. Based on such a phonetic decision tree, we can generate appropriate acoustic models. With this method, we achieved a relative word error reduction of about 9% on hyperarticulated speech.
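The abstract does not spell out the likelihood criterion, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes one-dimensional features and a single Gaussian per model, and it evaluates a single question ("is this frame hyperarticulated?") per phone instead of a full phonetic decision tree. The function names, the threshold value, and the synthetic data are all hypothetical.

```python
import math
import random


def gaussian_log_likelihood(values):
    """Log-likelihood of 1-D values under their own maximum-likelihood Gaussian."""
    n = len(values)
    mean = sum(values) / n
    sq_dev = sum((v - mean) ** 2 for v in values)
    var = max(sq_dev / n, 1e-6)  # floor keeps the log finite for constant data
    return -0.5 * n * math.log(2 * math.pi * var) - 0.5 * sq_dev / var


def split_gain(normal, hyper):
    """Likelihood gain from modeling normal and hyperarticulated frames separately."""
    pooled = gaussian_log_likelihood(normal + hyper)
    return gaussian_log_likelihood(normal) + gaussian_log_likelihood(hyper) - pooled


def phones_to_split(frames_by_phone, threshold):
    """Phones whose split gain exceeds the (hypothetical) threshold."""
    return sorted(
        phone
        for phone, (normal, hyper) in frames_by_phone.items()
        if split_gain(normal, hyper) > threshold
    )


# Synthetic toy data, not taken from the paper: the feature of "a" shifts
# under hyperarticulation, while the feature of "t" does not.
rng = random.Random(0)
draw = lambda mean: [rng.gauss(mean, 1.0) for _ in range(400)]
frames_by_phone = {"a": (draw(0.0), draw(2.0)), "t": (draw(0.0), draw(0.0))}
selected = phones_to_split(frames_by_phone, threshold=20.0)
```

In this toy setting only the phone whose features actually change under hyperarticulation earns a separate model, which mirrors the phone-dependent treatment the abstract describes. A real system would apply the same gain computation inside the clustering stage, alongside the usual phonetic context questions.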
Similar resources
Compensating hyperarticulation for automatic speech recognition
This thesis details the effects of hyperarticulation in the context of automatic speech recognition used for human-to-machine interaction. Hyperarticulation can be characterised as a speaking mode exhibiting an exaggerated articulation and occurs as a natural reaction in an effort to resolve recognition errors. Despite the user’s attempt to disambiguate word confusions, hyperarticulation causes...
Analysis and synthesis of hypo- and hyperarticulated speech
This paper focuses on the analysis and synthesis of hypo and hyperarticulated speech in the framework of HMM-based speech synthesis. First of all, a new French database matching our needs was created, which contains three identical sets, pronounced with three different degrees of articulation: neutral, hypo and hyperarticulated speech. On that basis, acoustic and phonetic analyses were performe...
Compensating for Hyperartic Articulatory Pro
In spoken dialogue systems, hyperarticulation occurs as an attempt to recover from previous recognition errors. It is commonly observed that users of automatic speech recognition systems apply recovery strategies similar to those used in human-human interaction. Previous studies have shown that current speech recognizers don’t cover hyperarticulated speech well. As an effect of higher word error rates at hypera...
Improved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
Allophone-based acoustic modeling for Persian phoneme recognition
Phoneme recognition is one of the fundamental phases of automatic speech recognition. Coarticulation, which refers to the integration of sounds, is one of the important obstacles in phoneme recognition. In other words, each phone is influenced and changed by the characteristics of its neighbor phones, and coarticulation is responsible for most of these changes. The idea of modeling the effects o...
Publication date: 2000